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Abstract 

For the solution of discrete ill-posed problems, in this paper a novel 
preconditioned iterative method based on the Arnoldi algorithm for 
matrix functions is presented. The method is also extended to work 
in connection with Tikhonov regularization. Numerical experiments 
arising from the solution of integral equations and image restoration 
are presented. 
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1 Introduction 

In this paper we consider the solution of ill-conditioned linear systems 

Ax = b, 

in which we assume A e M^''^ to be full rank with singular values that 
gradually decay to 0. As reference problems we consider the linear systems 
arising from the discretization of Fredholm integral equation of the first kind 
(commonly referred to as discrete ill-posed problems [12]), where A repre- 
sents the discretization of a compact operator. Most of the arguments here 



*Corrisponding author. Email adresses: novati@math.unipd.it (P. Novati), 
Michela.RedivoZaglia@unipd.it (M. Redivo-Zaglia), mrrusso@math.unipd.it (M.R. 

Russo) 



1 



presented can also be applied to certain saddle point problems (see e.g. 
or even Vandermonde type systems arising from interpolation theory (see 
e.g. [in])- For important applications, involving for instance Vandermonde 
type systems, b is assumed to be error-free. On the other hand, working 
with discrete ill-posed problems one typically assumes the right-hand side b 
affected by noise. In this paper we consider both cases, taking into account 
the two possible situations. 

In this framework, it is well known that many Krylov type methods such 
as the CG and the GMRES possess certain regularizing properties that allow 
to consider them as effective alternative to the popular Tikhonov regulariza- 
tion method, based on the minimization of the functional 

J(x,A) = px-bf + A||i/x||% (1) 

(||-|| denoting the Euclidean vector norm) where A > is a given parameter 
and if is a regularization matrix (see e.g. |12] and [15] for a background). 
Indeed, since most of Krylov methods working with A or A^A initially pick 
up the largest singular values of A, they can be interpreted as regulariza- 
tion methods in which the regularization parameter is the iteration num- 
ber m. We may refer to the recent paper [3J and the reference therein for 
an analysis of the spectral approximation properties of the Arnoldi-based 
methods and again [12] §6 for the CG-like methods. Anyway, in the frame- 
work of discrete ill-posed problems, Krylov subspace methods also present 
some important drawbacks. First of all we may have semi-convergence, that 
is, the method initially converges but rather rapidly diverges. This phe- 
nomenon typically appears when the Krylov method is implemented with 
the re-orthogonalization of the Krylov vectors (as for instance in the case of 
the Matlab version of the GMRES, where the orthogonality of the Krylov 
basis is guaranteed at the machine precision by the use of the Householder 
transformations). In this situation, after approximating the larger singular 
values (oversmoothing) the method is also able to provide a good approx- 
imation to the smallest ones (undersmoothing) . This allows to reach the 
maximum accuracy, attained for a certain rriopt, but at the same time a reli- 
able stopping criterium needs to be used to avoid divergence. On the other 
side, if a Krylov method is implemented without re-orthogonalization it is 
typically not able to produce good approximation of the smallest singular 
values. After say fn iterations (normally with m < rriopt, hence in a situation 
of oversmoothing) multiple or spurious approximations of the smallest sin- 
gular values typically appears because of the loss of orthogonality, and the 
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iteration stagnates around x^:. In this situation a valid stopping rule is no 
more so crucial but unfortunately the attainable accuracy is generally much 
poorer than the one obtained by the same method with re-orthogonalization. 
We refer to [12] §6.7 for an exhaustive explanation about the influence of re- 
orthogonalization in some classical Krylov methods. 

In order to overcome these problems, in this paper we present a new 
method that can be referred to as a preconditioned iterative solver in which 
the preconditioner is {A + A/) or {A^ A -\- XH^ H) . In detail, in the noise- free 
case, the method is based on the solution of the regularized system 

{A + A/) xa = b, 

and then on the computation of the solution x as 

X = /(A)XA, (2) 

where f{z) = 1 + Xz~^, using the standard Arnoldi method for matrix func- 
tions based on the construction of the Krylov subspaces with respect to A 
and Xa, that is, Km{A,:x.x) = spanjx^, Ax^, A™~^xa}. The method can 
be viewed as a preconditioned iterative method, since f{A) = A~^{A + XI). 
We have to remember that a regularization of the type {A + Ai/)xA = b has 
been considered by Franklin in [11] when A is SPD. 

It is worth noting that, with respect to standard preconditioned Krylov 
methods, in our method only one system with the preconditioner has to be 
solved so reducing the computational cost. Moreover it is important to point 
out that for problems in which the singular values of A rapidly decay to 0, 
as those considered in this paper, each Krylov method based on A shows 
a superlinear convergence (see [21] Chapter 5). For our method, this fast 
convergence is preserved since we still work with A for the computation of 
([2]) (see Section [3] for details). As we shall see, this idea, i.e., first regularize 
then reconstruct, will allow to solve efficiently the problem of divergence 
without loosing accuracy with respect to the most effective solvers. 

The method can be extended to problems in which the right hand side b is 
affected by noise just considering as preconditioner the matrix (A^A+XH^H) 
(cf. (PP)). As before the idea is to solve the system 

{A^A + XH^H)^x = A^h, 

and then to approximate the solution x by means of a matrix function eval- 
uation 

/(Q)xA = (A^A)-' {A^A + XH^H)^x, 
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where / is as before and Q = {H^H) {A^A). 

We need to point out that we could unify the theory taking H = I for 
the noise- free case, and hence work always with the Krylov subspaces with 
respect to the matrix Q. However, since A is ill-conditioned, for evident rea- 
sons, we prefer to consider two separate situations. Thus, we shall denote by 
ASP (Arnoldi with Shift Preconditioner) and ATP (Arnoldi with Tikhonov 
Preconditioner) the two approaches, respectively, for noise-free and noisy 
problems respectively. 

Besides the stability and the good accuracy, there is a third important 
property that holds in both cases: the reconstruction phase, that is, the 
matrix function computation, allows to select initially the parameter A even 
much larger (heavy oversmoothing) than the one considered optimal by the 
standard parameter-choice analysis (L-curve, Discrepancy Principle,..., see 
[12j for a background), without important changes in terms of accuracy. 
In this sense the method can be considered somehow independent of the 
parameter A (see the filter factor analysis presented in Section H]). 

We remark that the idea of using matrix function evaluations to improve 
the accuracy of the regularization of ill-conditioned linear systems has al- 
ready been considered in [3]. Anyway it is important to point out that the 
approach here presented is completely different since, as said before, only one 
regularized system needs to be solved. Indeed, in |5j the authors considers 
approximations belonging to the Krylov subspaces generated by {A + A/)~^ 
or [Al'" A + H)~^ (rational Krylov approach) that requires the solution of 
a regularized linear system at each Krylov step. Here we consider polynomial 
type approximations. 

The paper is structured as follows. In Section |2] we provide a background 
about the basic features of the Arnoldi method for matrix functions and we 
present the methods (ASP and ATP) studied in the paper. In Section [3] 
we analyze the error of the ASP method, providing also some consideration 
about the error of both methods in inexact arithmetic. In Section H] we 
analyze the filter factors of the methods. In Section [5] we present some 
numerical experiments, and a test of image restoration is shown in Section 
ini Some final comments are given in Section [71 
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2 The ASP and the ATP methods 



As already partially explained in the introduction, the ASP method approx- 
imates the solution of the ill-conditioned system Ax = b in two steps, first 
solving in some way the regularized system 

(A + A/)x, = b, (3) 

and then recovering the solution x from the system 

(A + A/)"Mx = xa, 

that is equivalent to compute 

X = /(A)x, (4) 

where 

fiz) = l + \z-\ (5) 

Independently of the way we intend to approximate x from (jlj), this repre- 
sents a novel approach because contrary to standard preconditioned iterative 
methods here only one linear system with the preconditioner needs to be 
solved. Of course this is possible because of the special preconditioner we 
are using but, in principle, the idea can be extended to any polynomial pre- 
conditioner. 

For the computation of /(y4)xA we use the standard Arnoldi method (or 
Lanczos in the symmetric case) projecting the matrix A onto the Krylov sub- 
spaces generated by A and x^, that is Km{A, x^) = spanjx^, v4xa, A'"~^xa} 
For the construction of the subspaces Km{A, x^), the Arnoldi algorithm gen- 
erates an orthonormal sequence {vj}^.>-,^, with Vi = Xa/||xa||, such that 
Km{A,yix) = spanjvi, V2, Vm} (here and below the norm used is always 
the Euclidean norm). For every m, in matrix formulation, we have 

(6) 

where Vm = [vi, V2, Vm], Hm is an upper Hessenberg matrix with entries 
hi J = vfAvj and is the j-th vector of the canonical basis of M™. 
The m-th Arnoldi approximation to x = /(A)xa is defined as 

x^ = ||xa|| VmfiHm)ei, (7) 
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(see [15] and the references therein for a background). For the computation 
f{Hjn), since the method is expected to produce a good approximation of 
the solution in a relatively small number of iterations (see Section |3]), that 
is for m <^ N, one typically considers a certain rational approximation to /, 
or, as in our case, the Schur-Parlett algorithm, [16] Chapter 9. 

We denote by ASP method the iteration ([7j) independently of the method 
chosen for solving Starting from Vi = x^/ ||xa||, at each step of the 
Arnoldi algorithm, we only have to compute the vectors = Avj, j > 1. 
Below the algorithm used to implement the method. 



ASP Algorithm 



Require A E M^^^, b G M^, A G M+ 

Define f{z) = 1 + Xz'^ 

Solve {A + XI) xa = b 

Vi ^ Xa/||xa|| 

for m = 1,2,... do 

Wm ^ A\rn 
7 T 

V ^ Wm - X;r=l ^fc.mVfc 
hm+l,m ^ ||v|| 
Vm+1 ^ ^ / hm+l,m 

Compute f{Hm) by Schur-Parlett algorithm 

Xm ^ ||xA||Kn/(-f^m)ei 

end for 



In the above algorithm, the Arnoldi method is implemented with the 
modified Gram-Schmidt process. Therefore, as is well known, the theoretical 
orthogonality of the basis is lost quite rapidly and consequently the method 
is not able to pick up the singular values clustered near 0. For this reason 
at a certain point during the iteration ([7]) the method is no longer able to 
improve the quality of the approximation and it stagnates, typically quite 
close to the best attainable approximation, and almost independently of the 
choice A (see Section [5]). 

Regarding the attainable accuracy (assuming that the seed xa is not 
affected by error), by the definition of / it depends on the the conditioning 
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of {A + A/)~^ A. Denoting by k(-) the condition number with respect to the 
Euchdean norm, theoretically the best situation is attained defining A such 
that 

K{A + XI) = K{{A + Xiy^A), (8) 

that is, the condition number of the preconditioner is equal to the condi- 
tion number of the preconditioned system. It is quite easy to prove (see e.g. 
[5J) that in the SPD case taking A = \/AiAjv, where Ai and Xn are respec- 
tively the smallest and the largest eigenvalue of A, we obtain k,{A + XI) = 
K{{A + Xiy^A) = y<A). 

The preconditioning effect of A + XI of course depends on the choice of 
A. By dH]) it is necessary to find a compromise between the preconditioning 
and the accuracy in the solution of the systems with A + XI. In this sense 
formula (|8]), that theoretically represents the optimal situation also implicitly 
states a lower bound for the attainable accuracy. Indeed, many numerical 
experiments arising from the discretization of Fredholm integral equation of 
the first kind, in which we have examined the behavior of some classical 
Krylov methods such as the GMRES and the CG preconditioned with A + 
XI, have revealed that we can substantially improve the rate of convergence 
(taking A ~ 1/ ■>/ k,{A), see again [5] for a discussion) but we are not able to 
improve the accuracy over a certain limit. 

The ASP method can be extended to problems in which the exact right 
hand side b is affected by noise. Anyway, since in presence of noise a good 
approximation of the exact solution may be meaningless, we extend the idea 
using the classical Tikhonov regularization. Moreover, many experiments 
have shown that the ASP method generally produces poor results for problem 
with noise. 

We assume in particular to know only a perturbed right-hand side b = 
b + eb, where eb is the perturbation. Given A > and H G M^^^ such that 
H^H is non singular, for approximating the solution of Ax = b we solve the 
regularized system 

{A^A + XH^H)^x = A^h. (9) 
and then we approximate x by computing 

X = {a^a)-\a^a + xh^h)^^ 

= /(Q)XA (10) 

where / is defined by ([5]) and Q = [H^H) ^[A'^A). As before, for the 
computation of /((5)xa we use the standard Arnoldi method projecting the 
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matrix Q onto the Krylov subspaces generated by Q and x^. Now, at each 
step we have have to compute the vectors Wj = Qv^, j > 1, with vi = 
xa/ ||xa||, that is, to solve the systems 

This means that we actually do not need Q explicitly. The algorithm is 
almost identical to the one given for the ASP method, apart from the two 
steps inserted in a box. 



ATP Algorithm 



Require A E M^^^, b G M^, A G M+ 
Define f{z) = 1 + Xz'^ 

Solve {A'^A + XH'^H)^x = A'^h 
Vi ^ Xa/||xa|| 
for m = 1, 2, ... do 

Solve (H^H) = {A^A}^ 

hk,m ^ ^k^m 
hm+l,m ^ ||v|| 

Compute f{H„i) by Schur-Parlett algorithm 

X^ ^ ||xA||Kn/(^^m)ei 

end for 



This kind of approach is somehow related with the so-called iterated 
Tikhonov regularization (see for instance [15] or [22]), with the important 
difference that now only one regularized system has to be solved. 

Remark 1 It is worth noting that the matrix Q is H -symmetric, that is, 
for each v, w G 

[H^HQifw = (H^HQ) w = v^A^Aw. 
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Because of this property the ATP method can be symmetrized using the Lanc- 
zos process based on this new metric. While this approach is promising be- 
cause computationally less expensive, some preliminary experiments have re- 
vealed that it is also quite unstable and, in general, less accurate than the 
ATP method. For this reason the analysis presented in the next sections does 
not regard this symmetric variant, that we plan to consider in a future work. 

3 Error analysis 

In exact arithmetic the error of the ASP method is given by := x — 
Xm where x^ is defined by ([7]). If we denote by n^-i the vector space of 
polynomials of degree at most m — 1, it can be seen that 

Xm = Pm^l(^)xA, (H) 

where x^ is the solution of (E]) and Pm-i G n^^i interpolates, in the Hermite 
sense, the function / at the eigenvalues of Hm, the so-called Ritz values. Ex- 
ploiting the interpolatory nature of the standard Arnoldi method, we notice, 
as pointed out also in [9], that the error can be expressed in the form 

Em = ||xa|| gmiA)qmiA)vi, Vi = Xa/ ||xa|| , (12) 

where 

(see also pJ,J), and 

^ f{z) -Pm~i{z) 

det{zl — Hm) 

From (|T2l) . a bound for \\Em\\ can be derived working with the field of values 
of A, defined as 

F{A) :=|^,yGC^\{0}|. 

Indeed, we can state the following result (see also the recent papers [1] and 
[8] for a background about the error analysis of the standard Arnoldi method 
for matrix functions). 
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Proposition 2 Assume that F{A) C C+. Then 



m 



i^™ii<^^n^^-^M, (13) 



i=l 



where a > is the leftmost point of F{A) and 2 < < 11.08. In the 
symmetric case we can take K = 1. 

Proof. From [B], we know that 

Ikml^)!! < K max \gm{z)\ , 



with 2<K< 11.08, and hence by ([T2]) 

ll^mll < ^ ||xa|| max \gmiz)\ \\qm{A)vi\\ . 

z£F(A) 

Now gm{z) is a divided difference that can be bounded using the Hermite- 
Genocchi formula (see e.g. [7]), so that 



\gm[z)\ < — r max 

ml ^eco{z,a{H^)} 

A 

< max 



d'" A 



where co {z, a{Hm)} denotes the convex hull of the point set given by z and 
a{Hm)- Since (j{Hm) C F{Hm) ^ F{A), by some well known properties of 
the Arnoldi algorithm, and using the relation 



i=l 



that arises from (see [20]), the result follows. ■ 

Since a, the leftmost point of F{A), can be really small for the problems 
we are dealing with, formula (fT3|) can surely be considered too pessimistic 
with respect to what happens in practice. However, the upper bound given 
by fll3p allows to derive some important information about the behavior of 
the error. First of all, it states that the rate of convergence is little influenced 
by the choice of A, and this is confirmed by the analysis given in Section H] 
and by the numerical experiments. Secondly, it states that, independently 
of its magnitude, the error decay is related with the rate of the decay of 
YYiLi hi+i,i- We need the following result (cf. [21] Theorem 5.8.10). 
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Theorem 3 Let aj and Xj, j > 1, be respectively the singular values and the 
eigenvalues of an operator A. Assume that \Xj\ > and 

cr^ < oo for a certain < p < 1. (14) 
Let s^miz) = YYll^{z - Xi) . Then 

V m / 

where 

1 + p^ 

p ^-^ ■' 

^ i>i 

Of course, the hypothesis ( fT^ is fulfilled by many problems arising from 
the discretization of integral equations, in many cases with p quite small. 
Now, using the relation (|23j p. 269), 

m 

Whi+i^i < \\Sm{A)\\\ 
i=l 

that holds for each monic polynomial Sm of exact degree m, we can say that 
Theorem [3] reveals that for discrete ill-posed problems the rate of decay of 
rii^i is superlinear and depends on the p-summability of the singular 
values of A, i.e., on the degree of ill-posedness of the problem (cf. [17] Def. 
2.42). 

In computer arithmetics, we need to assume that x^, solution of ([3]) is 
approximated by with an accuracy depending on the choice of A and the 
method used. In this way, the Arnoldi algorithm actually constructs the 
Krylov subspaces Km{A,"x.\). Hence the error can be written as 

= ||/(A)x,-||x,||V;,/(i7jei||< 

||/(A)XA - IIxaII V^f{H,r.)ei\\ + \\f{A) (XA - xa)|| . (15) 

The above formula expresses the error in two terms, one depending on the 
accuracy of the Arnoldi method for matrix functions and one on the accuracy 
in the computation of xa. Roughly speaking we can state that for small values 
of A, f{A) ^ I (cf. (jlj) and we have that ||-E'm|| ~ ||xa — xa||- This means 



11 



that the method is not able to improve the accuracy provided by the solution 
of the initial system. For large A we have that ~ xa because the system 
(|3]) is well conditioned, but even assuming that ||/(v4) (xa — xa)|| ~ that in 
principle may happen even if ||/(y4)|| is large, we have another lower bound 
due the ill conditioning of f{A) = {A + XI) since now A + XI has a poor 
effect as preconditioner. 

Regarding the optimal choice of A we can make the following considera- 
tion. Unless the re-orthogonalization or the Householder implementation is 
adopted, the Arnoldi method typically stagnates around the best approxima- 
tion Xm because of the loss of orthogonality of the Krylov basis. Therefore 
let c(A) be such that 

||/(A)xa - ||xa|| K*/(if„)ei|| c(A) asm-^ N. 

Then by (fT5!) the optimal value of A depends on the method used to compute 
Xa and is given by 

Aopt = argmin (c(A) + \\f{A) (xa - xa)||) . (16) 

Of course the above formula is interesting only by a theoretical point of view. 
In practice, as mentioned in the introduction, one could try to compare the 
conditioning of A + XI and f{A), by approximating the solution of 

n{A + XI) = K{{A + Xiy^A). (17) 

with respect to A. However, since the computation of xa comes first, it is 
suitable to take A a bit larger than the solution of f|T7|) . Note that generally 
such solution can be approximated by A = 1/k{A). 

For the ATP method the analysis is almost identical since the error is 
given by 

Em ■= /(Q)xa - ||xa|| VmfiIIrr,)ei, 

where {A^A + XH'^H)xx = A'^h, {A^A + XH^H)5lx = A^h, and Q = 
{II^II)~^{A'^A). Hence, as before we have 

\\Em\\ < ||/(g)xA -p„-l(g)xA|| + WfiQ) (xa - Xa)|| , 

where Pm-i is again defined by f fTT]) . This expression is important since it 
states that theoretically we may take A very large, thus oversmoothing, in 
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order to reduce the effect of noise and then leaving to the Arnoldi algorithm 
the task of recovering the solution. Unfortunately, the main problem is that, 
as before, f{Q) may be ill-conditioned for A large. Henceforth, even in this 
case we should find a compromise for the selection of a suitable value of A, but 
contrary to the ASP method for noise-free problems it is difficult to design 
a theoretical strategy. Indeed everything depends on the problem and on 
the operator H. In most cases the noise on the right-hand side produces an 
increment of the high-frequency components of b, that are emphasized on the 
solution by the nature of the problem. For this reason H is generally taken 
as a high-pass filter, as for instance a derivative operator, and the solution of 
([T]) can be interpreted as a numerical approximation via penalization of the 
constrained minimization problem 

min 1 1 Ax — b 1 1 

||_ffx||=0 " " 

While in standard constrained minimization one approximates the solution 
taking A very large (theoretically A — )■ oo), in our case H is hardly able to 
detect efficiently the effect of noise on the numerical solution so that one is 
forced to adopt some heuristic criterium such as the L-curve analysis. In 
general terms we can say that if the solution is smooth and involves only low 
frequencies then a high-pass filter should lead to a good approximation taking 
A "large" . On the other side if the solution involves itself high-frequencies as 
in the case of discontinuities, then it is better to undersmooth the problem 
so reducing the effect of the filter. We have made these considerations just 
to point out that a general theoretical indication on the choice of A is not 
possible dealing with problems affected by error. What we can do is to derive 
methods able to reduce the dependence on this choice, and the ATP method 
seems to have some chances in this direction. 

4 Filter factors 

In order to understand the action of the second phase of the methods, i.e., 
the matrix function evaluation applied to the regularized solution (cf. (jl]) 
and (fTOj) ). below we investigate the corresponding filter factors. 

Assuming for simplicity that A is diagonalizable, that is, A = XDX~^ 
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where D = diag(Ai, Aat), for the ASP method we have 



1=1 

where Xj is the eigenvector associated with Aj, and {■)■ denotes the i-th 
component of a vector. After the first phase, the fiher factors are thus 
Qi = Aj (Aj + A)^^. Since from f lTTj) . we have = Pm-i{^)^\, where Pm-i 
interpolates the function / at the eigenvalues of Hm, we immediately obtain 

A»Pm-i(Ai) (^~^b).. 
A. + A A, 

Therefore, at the m-th step of the ASP method the filter factors are given 
by 

~ A, + A ' 

Let us compare, with an example, the behavior of the filter factors. Similarly 
to what was made in [12], we consider the problem GRAVITY taken from 
the Hansen's Regular izat ion Tools [131 [II]; with dimension = 12. In 
Figured], the filter factors gi and fl"^\ for m = 4,6,8,10 are plotted. As 



regularization parameter we have chosen A = l/^yK{A). Since the problem 
is SPD, for more clarity in the pictures, the eigenvalues A, have been sorted 
in decreasing order. 

While the problem is rather simple the pictures clearly represent the 
action of the Arnoldi (Lanczos in this case) steps. Since the Arnoldi (Lanczos) 
algorithm initially picks up the largest eigenvalues, it automatically corrects 
the filters corresponding to the low- middle frequencies {g^ — )■ f-'^^ ~ 1), keep 
damping the highest ones. The second phase thus performs a correction, but 
the properties of the Arnoldi algorithm guarantees that the method can still 
be interpreted as a regularizing approach. 

For a better explanation of Figure [H let us assume for simplicity that the 
Ritz values rj, j = 1, ...,m, are distinct (as in the example), so that we can 
write 

m 
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m=4 




m=8 




m=6 




m=10 




Figure 1: Filter factors Qi (asterisk) and f^"^^ (circle) with m — 4,6, 8, 10, for 
GRAVITY(12). 

where Ij, j = 1, ...,m are the Lagrange polynomials. Hence we obtain 

h l^h^^^) x, + \' ' 

Since the Arnoldi algorithm ensures that Vj ^ Xj for j = l,...,m we have 
f^'^^ fti 1 for i < m. For i > m and when Aj ~ we have that 

/f^^P^-i(O) ^' 



\ + A' 



so that the filters are close to the ones of the uncorrected scheme. Of course, 
numerically, the problems start to appear when the Arnoldi algorithm fails to 
provide good approximations of the eigenvalues of A, but it is important to 
observe that, at least in exact arithmetics, the choice of A only influences the 
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high frequencies. For this reason, at least for the ASP method, this choice is 
more related to the conditioning of the subproblems (cf . Section [3]) . 

The filter factor analysis just presented remains valid also for the ATP 
method. Taking H = I in (Q and using the SVD decomposition we easily 
find that the filter factors are now given by 

Am) ^ (^fPm~l{(yf) 

o\ + A 

and hence our considerations for the ASP method remains true also for this 
case. Of course for if 7^ / we just need to consider the GSVD. For problems 
with noise, the choice of A is of great importance. Anyway we have just seen 
that the correction phase allows to reproduce the low frequencies indepen- 
dently of this choice. In this sense, in practice we can take A even very large 
in order to reduce as much as possible the influence of noise. 

5 Numerical experiments 

This section is devoted to the numerical experiments obtained on a single pro- 
cessor computer Intel Core Duo T5800 with Matlab 7.9. Our goal is to prove 
numerically what we consider the valuable properties of the ASP and the 
ATP methods, that is, accuracy and speed comparable with the most effec- 
tive iterative solvers, stability, and reduced dependence on the parameter A. 
For the experiments we consider problems taken from the Regularization 
Tools Matlab package by Hansen [131 [H]- Our comparison method is the 
Matlab version of the GMRES, that is implemented with the Householder 
algorithm that guarantees the orthogonality of the Krylov basis at the ma- 
chine precision. For the problems here considered the GMRES method has 
shown to be the most accurate, if compared to other well known methods 
that we can found in the literature. Since it is also quite unstable, it is 
generally implemented together with the discrepancy principle as stopping 
criterium (where it is possible of course), but not always with good results. 
We point out that the modified Gram-Schmidt version of the GMRES has 
also been considered in the experiments (even if not reported); this version is 
stable, but unfortunately the attainable accuracy loses one or even two order 
of magnitude with respect to the version implemented by Matlab. Other 
methods such as the CGLS and LSQR are widely inferior for the problems 
here considered. 
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In all experiments the Arnoldi algorithm for the ASP and the ATP meth- 
ods, as said in Section O is implemented with the modified Gram-Schmidt 
orthogonalization, and the initial linear system is solved with the LU or the 
Cholesky factorization. 
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1.37 X 10-5(9) 
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Table 1: Results for BAART(240) in the noise-free case. 

As first test problem we consider BAART(240) (in parenthesis, as usual, 
we indicate the dimension A^). The estimated condition number of the cor- 
responding matrix A is around 10^°. We first consider the noise-free case 
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Figure 3: Error behavior of the GMRES and the ATP method with A = 1 
and A = 10^° for BAART(240) with Gaussian noise. 

comparing the behavior of the ASP method with GMRES, taking different 
values of the parameter A. Looking at Figure [2] we can observe that even con- 
sidering a wide range of values for A, contrary to GMRES the ASP method 
does not suffer from semi-convergence, that is, the error always stabilizes 
around the minimum. The attainable accuracy is always quite close to the 
one of GMRES. The number of iterations necessary to achieve the minimum 
accuracy is almost always the same, as expected from Proposition [2] and it 
depends on the spectral properties of the operator, that is, on the fast decay 

of nili^i+i.i (cf- Theorem ED . 

Another important observation can be made looking at the error curve 
corresponding to the choice of A = 10~^^ (line with asterisks). Since this curve 
is almost flat we argue that this value of A is probably very close to the value 
Aopt defined by f lT6|) . that seeks for a compromise between the accuracies 
in the solutions of the initial linear system and in the computation of the 
matrix function. In Table [T] the minimal errors (with the iteration numbers 
in parenthesis) and the corresponding residuals are reported. 
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error residual A 

ATP 4.00 X 10-^(2) 2.70 x 10"^ 

6.01x10-^(4) 2.17x10-^ 10^ 

GMRES 5.66 x lO^^js) 2.16 x 10"^ 



Table 2: Results for BAART(240) with Gaussian noise. 

Now we consider the same problem with right-hand side affected by noise. 
We try to solve Ax = b working with an inexact right-hand side b = b + Gb 
where eb is a noise vector of the type 

S 1 1 b 1 1 

eb = u, (18) 



where we define 6 = 10^^ as the relative noise level, and u = randn(N, 1), 
that in Matlab notation is a vector of random components with normal 
distribution with mean and standard deviation 1. For the ATP method, 
we define H as the discrete second derivative operator, that is. 



H 



/ 2 -1 \ 

■•. ■•. -1 

V -12/ 



and we choose A = 1 and A = lO^*'. The comparison is made again with the 
GMRES. The error curves are plotted in Figure [31 For A = 1 the method does 
not provide a substantial improvement to the first iteration that correspond 
to the Cholesky solution of the Tikhonov system. Probably this is due to 
the fact that A = 1 is close to the value attainable with the L-curve analysis. 
Anyway it is important to notice that the method does not deteriorate that 
approximation during the iteration. For A = 10^° we have an effective and 
stable improvement with a good accuracy if compared with the one of the 
GMRES. In order to avoid confusion in the pictures we only consider these 
two values, since in the internal range the curves are similar, showing the 
robustness of the method with respect to the choice of the parameter A. The 
results are reported in Table [2j 

For a fair comparison between the ASP method and GMRES we also 
consider the preconditioned version of this code that we denote by PGMRES 
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with the same preconditioner used by the ASP method, that is, A + XI. 
Working again with BAART(240) with exact right-hand side, in Figure H] 
we plot the error curves with respect to the computational cost. While 
a flops counter is no longer available in Matlab, it is quite easy to derive 
these numbers knowing the algorithms. The non-vectorial operations are 
neglected. For both methods the systems with A + XI are solved by means 
of the LU factorization, computed only once at the beginning. Of course, 
each PGMRES iteration is more expensive since it requires the solution of a 
system with A + XI. 




Figure 4: Error behavior of the preconditioned GMRES and the ASP method 
for BAART(240) with A = lO^^ (left) and lO"^ (right). 
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13 


10-^ 
10-^ 



Table 3: Comparison between the ASP method and the PGMRES for A = 
10-5,10-^. 

The results reported in Figure H] reveal that the ASP is still competitive 
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with the PGMRES in terms of accuracy and computational cost. For this 
example the PGMRES is a bit faster than GMRES (cf. Figure [2]) since the 
error curve is steeper at the beginning, but it remains unstable. Comparing 
also the results of these examples (Table [3]) with the ones reported in Table 
[H we also observe a very little improvement in terms of accuracy. 

1 p 

BAART 
FOXGOOD 

\ SHAW 
° " \ GRAVITY 




X 

Figure 5: Maximum attainable accuracy with respect to the choice of A. The 
dimension of each problem is = 160. 

In a final example we want show the behavior of the methods in four classi- 
cal problems (BAART, FOXGOOD, SHAW and GRAVITY), with N = 160, 
changing the value of the parameter A. Figure [5] is representative of what 
happen in general for the ASP method with exact right-hand side, that is, as 
expected, the attainable accuracy is generally poor for small values of A (the 
initial system is badly solved) and for large values of A (the preconditioning 
effort is poor). In any case it is really important to observe that the maxi- 
mum accuracy can be obtained without much differences for a relatively large 
window of values for A, since the curves exhibit a plateau around the mini- 
mum. Indicatively, we may say that the maximum accuracy can be achieved 
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taking A in a range between 1/ a/ k{A) and 1/ ^ k{A). The importance of this 
behavior is not neghgible because it means that having an estimate of the 
conditioning of A allows to skip any pre-processing techniques to estimate 
the optimal value of A. 




Figure 6: Maximum attainable accuracy with respect to the choice of lambda 
with right-hand side affected by noise. The dimension of each problem is 
= 160. 

Assume now to work with a right-hand side affected by noise, b = b + eb, 
where eb is a defined by (fT8|) with noise level 6 = 10^^. Looking at Figure [6l 
we can observe that with respect to the noise-free case we do not even have 
the problem of oversmoothing taking A too large, at least for the example 
considered. We argue that the bottleneck, for what concerns the accuracy, is 
represented by the effect of noise. In general, increasing the value of A leads 
to a slight increase of the number of iterations. These considerations leads us 
to state a general strategy for an automatic parameter-choice implementation 
of the method. 

1. define A relatively "large", for instance even much larger than the point 
of maximum curvature of the L-curve; 
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2. use any parameter-choice method for m to define the stopping rule (as 
for instance the discrepancy principle where possible), allowing some 
more iterations to avoid oversmoothing (m too small, cf. Figure [3]). 

Concluding we may say that for the ATP method of course there exists 
an optimal value of A, say Aopt, close to the corners of the L-shaped curves 
of Figure El and a corresponding mopt, that is, the minimum number of 
iterations to achieve the optimal accuracy. Anyway, our experiments reveal 
that working with A > Aopt and m > mopt, we do not have a sensible loss of 
accuracy nor a remarkable increase of computational cost. 

6 An example of image restoration 

In this section we consider a problem of image restoration. The example is a 
2D image deblurring problem which consists of recovering the original n x n 
image from a blurred and noisy image. The original image is denoted by 
X and it consists of n x n grayscale pixel values. Let x = vec {X) G M'^, 

= n^, be the vector whose entries are the pixel values of the image X. 
Let moreover A G M^^^ be the matrix representing the blurring operator, 
coming from the discretization of the Point Spread Function (PSF). The 
vector b = Ax represents the associated blurred and noise-free image. We 
generate a blurred and noisy image b = b + eb, where eb is a noise vector 
defined by ([18]) with 5 = IQ-^. 

The matrix A is a symmetric Toeplitz matrix given by 



where T is a n x n symmetric banded Toeplitz matrix whose first row is a 
vector V whose element are 



The parameter q is the half-bandwidth of the matrix T, and the parameter 
cr controls the width of the underlying Gaussian point spread function 




for j = 1, ...,g 
for j = q + 1, ...,n 
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which models the degradation of the image. Thus, a larger a implies a wider 
Gaussian and thus a more ill-posed problem. For our experiments X is a 
100 X 100 subimage of the image coins .png from Matlab's Image Processing 
Toolbox, shown as the first image in Figure [3 We define g = 6 and a = 1.5, 
so that the condition number of A is around 10^". We report the results of our 
image restoration using two different regularization operators. In particular 
we consider the matrix 



Hi ^2 



I® Hi 
Hi® I ' ' 



where Hi 



( 1 



v 



taken from ^18j (with a slight modification in order that Hf2Hi,2 is nonsingu- 
lar), and the matrix if2,2 defined as the discretization of the two-dimensional 
Laplace operator with zero-Dirichlet boundary conditions, that is, 

/ 4 -1 -1 \ 

-1 4 -1 -1 



H. 



2,2 



v 



4 -1 4 -1 

-1 -1 4 



^NxN 



Figure [7] shows that the ATP method can be fruitfully used also for these 
kind of problems. Due to the well marked edges, the original image involves 
high-frequencies so that the restoration by means of the standard derivative 
operators is intrinsically difficult, because they are high-pass filters. 

Table S] shows that also for this kind of problems the attainable accuracy 
is weakly infiuenced by the choice of A. 



1 10^ 10^ 10^ 
Hi^2 0.060 0.060 0.062 0.059 
H2,2 0.061 0.064 0.069 0.075 

Table 4: Attainable accuracy (Euclidean norm of the error) for the image 
restoration with H12 and if2,2 using different values of A. 
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Original Image Blurred and noisy Image 




Figure 7: Image restoration with the ATP method using Hi 2-, H2.2 and 
A = 100. 

7 Conclusions 

In this paper we have presented a new approach for the solution of discrete 
ill-posed problems. The basic idea is to solve the problem in two steps: 
first regularize and then reconstruct. We have described two methods based 
on this idea, the ASP method that is actually a particular preconditioned 
iterative solver, and the ATP method that is a method that tries to improve 
the approximation arising from the Tikhonov regularization. In both cases 
the reconstruction is performed evaluating a matrix function by means of the 
standard Arnoldi method. This idea can also be interpreted as a modification 
of the iterated Tikhonov regularization (see for instance p5] and [22]). 

Being iterative, both methods should be interpreted as methods depend- 
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ing on two parameters, that is, A and the number of iterations m. Actually 
our implementation of the Arnoldi method (Gram-Schmidt) is very stable 
so that for a fixed A, the undersmoothing effect, theoretically determined by 
taking m large, in general does not deteriorate the approximation. Therefore 
the only important parameter is A. Anyway, the most important property of 
both methods is that they do not need an accurate estimate of this parameter 
to work properly (cf. Section HJ Figures E] and O and Table 4). Of course this 
property is particularly attractive for problems in which a parameter-choice 
analysis is too expensive or even unfeasible as for instance for large scale 
problems such as the image restoration. 

As possible future developments, we observe that the ASP method could 
be quite easily extended to work in connection with polynomial precondition- 
ers (see e.g. [2] for a background). This can be done replacing {A + XI)~^ 
with a suitable Pm{A) ~ A~'^ and changing accordingly the matrix function 
to evaluate. Also the symmetric version of the ATP method (see Remark [1]) 
seems quite interesting and requires further investigation. 

Finally, we want to point out that the present paper was just intended 
to present the basic ideas and properties of the methods; in this sense, a 
reliable implementation with stopping criterium, choice of A, etc., has still 
to be done. 
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